Sequence discriminative training for deep learning based acoustic keyword spotting
Related resources
Keyword-based Discriminative Training of Acoustic Models
In this paper, we investigate a new discriminative training technique which focuses on optimizing a keyword error rate, rather than the error rate on all words. We hypothesize that improvements in keyword error rate correlate with improvements in understanding error rates. Keyword-based discriminative training is accomplished by modifying a standard minimum classification error (MCE) training a...
Discriminative keyword spotting
This paper proposes a new approach for keyword spotting, which is not based on HMMs. The proposed method employs a new discriminative learning procedure, in which the learning phase aims at maximizing the area under the ROC curve, as this quantity is the most common measure to evaluate keyword spotters. The keyword spotter we devise is based on nonlinearly mapping the input acoustic representat...
Deep Residual Learning for Small-Footprint Keyword Spotting
We explore the application of deep residual learning and dilated convolutions to the keyword spotting task, using the recently-released Google Speech Commands Dataset as our benchmark. Our best residual network (ResNet) implementation significantly outperforms Google’s previous convolutional neural networks in terms of accuracy. By varying model depth and width, we can achieve compact models th...
Transferable Deep Features for Keyword Spotting
Deep features, defined as the activations of hidden layers of a neural network, have given promising results applied to various vision tasks. In this paper, we explore the usefulness and transferability of deep features, applied in the context of the problem of keyword spotting (KWS). We use a state-of-the-art deep convolutional network to extract deep features. The optimal parameters concerning...
Sequence-discriminative training of deep neural networks
Sequence-discriminative training of deep neural networks (DNNs) is investigated on a 300-hour American English conversational telephone speech task. Several sequence-discriminative criteria are compared: maximum mutual information (MMI), minimum phone error (MPE), state-level minimum Bayes risk (sMBR), and boosted MMI. Two different heuristics are investigated to improve the performance of ...
Journal
Journal title: Speech Communication
Year: 2018
ISSN: 0167-6393
DOI: 10.1016/j.specom.2018.08.001